DETAILED ACTION

Continued Examination Under 37 CFR 1.114 

1. A request for continued examination under 37 CFR 1.114, including the fee set 
forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this 
application is eligible for continued examination under 37 CFR 1.114, and the fee set 
forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action 
has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 
10/31/07 has been entered. 

Response to Arguments 

2. Applicant's arguments filed 10/31/07 have been fully considered but they are not 
persuasive. 

Applicant argues that neither Bennett et al., nor Murveit et al., teach or suggest 
receiving a speech utterance from a user and then extracting characteristics about the 
user from content of the speech; and selecting a single one of the ASR engines to 
recognize the speech utterance (Amendment, pages 6 - 8). 

The examiner disagrees. Bennett et al., teach that a user calls into the system, 
navigates the menus using control keywords, and then starts a dictation process. 
Additionally, a variety of recognizers optimized for dictation may be available, for 
example. If the system knows that the user is dictating a legal memo based on the 
current state of the dialog, it may use the legal-dictation optimized recognizer 
(paragraph 33, lines 8-21). Using the legal-dictation optimized recognizer for dictating 
a legal memo implies receiving a speech utterance from a user and then extracting 
characteristics about the user from content of the speech; and selecting a single one of 
the ASR engines to recognize the speech utterance, since the recognizer is selected 
based on the current state of the dialog. 

Claim Rejections - 35 USC § 103 

3. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

4. Claims 1 - 8, 14 - 20 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Bennett et al., (US PAP 2002/0194000), in view of Murveit et al., (US Patent 
7,058,573). 

As per claims 1, 8, and 14, Bennett et al., teach an automatic speech recognition 
(ASR) that comprises: 

providing a plurality of categories ("American male") for different speech 
utterances; assigning a different ASR engine to each category ("recognizers that have 
good performance for American men southern accents be enabled") based on the ranks 
of the ASR engines ("select the best recognizer and its results"; paragraph 15, lines 6 - 
9; paragraph 19; paragraph 20, lines 7-9; Abstract, lines 7, and 8); 

processing the different speech utterances at different ASR engines ("the speech 
recognition system enable some of the speech recognizers and received results"; 
Abstract, lines 4-6); 

receiving a first speech utterance ("receiving the input stream") from a first user 
(paragraph 12, lines 1, and 2; paragraph 19, lines 10 - 12); and 

extracting characteristics about the first user from content of the first speech 
utterance to classify the first speech utterance into one of the categories; and selecting 
a single one of the ASR engines assigned to the category to which the first speech 
utterance is classified to automatically recognize the first speech utterance ("a user calls 
into the system and navigates the menus using control keywords and then starts a 
dictation process. Additionally, a variety of recognizers are optimized for dictation may 
be available, for example. If the system knows that the user is dictating a legal memo 
based on the current state of the dialog, it may use the legal-dictation optimized 
recognizer"; paragraph 33, lines 8-21). 

However, Bennett et al., do not specifically teach receiving ground truths with 
correct text for the different speech utterances; and comparing output from each of 
the different ASR engines with the ground truths to determine ranks of the different ASR 
engines for accuracy in recognizing the different speech utterances. 
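
For illustration only, the limitation recited above (comparing the output of each ASR 
engine with ground-truth text to rank the engines for each category) can be sketched 
as follows. The engine names, the sample transcripts, and the helper functions 
`word_error_rate` and `rank_engines` are hypothetical and appear in neither the claims 
nor the cited references. Word error rate is assumed here as the accuracy measure. 

```python
from collections import defaultdict


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # At the start of each outer iteration, prev[j] is the edit distance
    # between the reference words seen so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / max(len(ref), 1)


def rank_engines(samples, outputs):
    """Rank engines per category, best (lowest mean WER) first.

    samples: list of (category, ground_truth_text) pairs (hypothetical data).
    outputs: dict mapping engine name to one hypothesis per sample.
    """
    scores = defaultdict(lambda: defaultdict(list))
    for idx, (category, truth) in enumerate(samples):
        for engine, hyps in outputs.items():
            scores[category][engine].append(word_error_rate(truth, hyps[idx]))
    return {
        category: sorted(per_engine,
                         key=lambda e: sum(per_engine[e]) / len(per_engine[e]))
        for category, per_engine in scores.items()
    }


# Hypothetical ground truths and engine outputs.
samples = [("male", "call the office now"), ("female", "read my new mail")]
outputs = {
    "engine_a": ["call the office now", "read my new male"],
    "engine_b": ["call the office", "read my new mail"],
}
ranks = rank_engines(samples, outputs)
# Assign the top-ranked engine to each category.
assigned = {category: order[0] for category, order in ranks.items()}
```

In this sketch each category is assigned the engine with the lowest mean word error 
rate on that category's ground-truth utterances. Other accuracy measures could be 
substituted without changing the structure. 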

Murveit et al., teach an example in which the spoken input is assumed to be the word 
"Boston". The assigned score is a probability, or is related to the probability, that the 
corresponding expression correctly corresponds to the spoken input. The expression with 
the highest assigned score or certainty is selected as the output (a probability that the 
corresponding expression correctly corresponds to the spoken input implies comparing 
output from each of the different ASR engines with the ground truths to determine ranks 
of the different ASR engines for accuracy in recognizing the different speech utterances, 
since the highest score is selected among all the assigned scores; col. 2, lines 56, and 57; 
col. 5, lines 21 - 23; col. 9, lines 22 - 24). 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to assign scores based on whether an expression correctly 
corresponds to the speech input, as taught by Murveit et al., in Bennett et al., because 
that would maintain a high degree of recognition accuracy in a speech recognition 
system (col. 2, lines 33, and 34). 

As per claims 2, and 15, Bennett et al., further disclose providing a plurality of 
categories for different speech utterances further comprises providing a male category 
and a female category ("gender"; paragraph 19, lines 10-12; paragraph 31, line 3). 

As per claim 3, Bennett et al., further disclose assigning a different ASR engine 
to each category further comprises assessing accuracy of each ASR engine for each 
category ("accuracy of each recognizer in a particular situation"; paragraph 22, lines 8, 
and 9). 

Application/Control Number: 10/668,121 
Art Unit: 2626 

As per claims 4, and 16, Bennett et al., further disclose assessing accuracy of 
each ASR engine for each category further comprises determining a least Word Error 
Rate of each ASR engine for each category ("a recognizer with a recognizer-based 
confidence value of 90%"; paragraph 42, lines 3, and 4). 

As per claim 5, Bennett et al., further disclose assigning a different ASR engine 
to each category further comprises assessing time required for each ASR engine to 
recognize speech utterances ("performance overtime"; paragraph 42, line - paragraph 
43, line 3). 

As per claim 6, Bennett et al., further disclose receiving a second speech 
utterance from a second user; classifying the second speech utterance into one of the 
categories; and selecting the ASR engine assigned to the category to which the second 
speech utterance is classified to automatically recognize the speech utterance, wherein 
the ASR engine assigned to the category to which the second speech utterance is 
classified is different from the ASR engine assigned to the category to which the first 
speech utterance is classified (using characteristics of the communication channel and 
contextual information such as gender to enable some of the recognizers among a 
plurality of recognizers, implies that it is inherent to classify another speech to another 
category; paragraph 20; paragraph 17; paragraph 31, line 3). 

As per claim 7, Bennett et al., further disclose that the first speech utterance is 
classified into a male category, and the second speech utterance is classified into a 
female category ("gender"; paragraph 19, lines 10-12; paragraph 31, line 3). 

As per claim 17, Bennett et al., further disclose at least three different ASR 
engines and at least three different combination schemas of ASR engines to represent 
a total of at least six different ASR engines ("processing cell phone audio stream with 
some recognizers among multiple recognizers"; paragraph 10, lines 2, and 3; paragraph 
16, lines 2 -4). 

As per claim 18, Bennett et al., further disclose a telephone network 
comprising at least one switching service point coupled to the computer system ("output 
switch 16"; paragraph 4, lines 8 -10; paragraph 10; paragraph 13, line 3). 

As per claim 19, Bennett et al., further disclose at least one communication 
device in communication with the switching service point to provide the speech 
utterance ("cell phone connection"; paragraph 10; paragraph 13, line 3). 

5. Claims 9-13, and 20 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Bennett et al., (US PAP 2002/0194000) in view of Murveit et al., (US 
Patent 7,058,573). 

As per claims 9, and 20, Bennett et al., in view of Murveit et al., do not 
specifically teach storing a ranking matrix, the ranking matrix comprising a plurality of 
different categories of speech signals and a plurality of different ASR engine 
corresponding to the plurality of different categories. 

However, since Bennett et al., teach selecting the recognizers that are better for 
cell phone audio streams than other recognizers (paragraph 16, lines 2 - 4), one having 
ordinary skill in the art would have found it obvious to use a ranking matrix comprising a 
plurality of different categories of speech signals and a plurality of different ASR engine 
within Bennett et al., because that would determine the recognizers that would be 
enabled for a particular input stream (paragraph 16, lines 5, and 6). 

As per claim 10, Bennett et al., further disclose different categories are selected 
from the group consisting of gender, noise level, and pitch ("signal strength"; paragraph 
15, line 7; paragraph 31, line 3). 

As per claim 11, Bennett et al., further disclose different ASR engines comprise 
single ASR engines ("single recognizer") and multiple ASR engines combined together 
(paragraph 21, lines 1, and 2; paragraph 20, lines 7, and 8). 

As per claim 12, Bennett et al., further disclose the plurality of different ASR 
engine rankings are derived from statistical analysis ("performance history of the 
particular recognizer"; paragraph 23, line 5). 

As per claim 13, Bennett et al., further disclose that the statistical analysis 
comprises assessing accuracy of speech recognition of different ASR engines with 
different speech signals ("accuracy of each recognizer in a particular situation"; 
paragraph 22, lines 8, and 9). 



6. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Leonard Saint-Cyr whose telephone number is (571) 
272-4247. The examiner can normally be reached on Monday through Friday. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Richemond Dorvil, can be reached on (571) 272-7602. The fax phone 
number for the organization where this application or proceeding is assigned is (571) 
273-8300. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 